Algorithm independent bounds on community detection problems and associated transitions in stochastic block model graphs
Abstract
We derive rigorous bounds for well-defined community structure in complex networks for a stochastic block model (SBM) benchmark. In particular, we analyze the effect of inter-community "noise" (inter-community edges) on any "community detection" algorithm's ability to correctly group nodes assigned to a planted partition, a problem which has been proven to be NP-complete in a standard rendition. Our result relies neither on the use of any one particular algorithm nor on an analysis of the limitations of inference. Rather, we turn the problem on its head and work backwards to examine when, in the first place, well-defined structure may exist in SBMs. The method that we introduce here could potentially be applied to other computational problems. The objective of community detection algorithms is to partition a given network into optimally disjoint subgraphs (or communities). Similar to k-SAT and other combinatorial optimization problems, "community detection" exhibits different phases. Networks that lie in the "unsolvable phase" lack well-defined structure and thus have no meaningful partition. Solvable systems split into two disparate phases: the "hard" phase and the "easy" phase. As befits its name, within the easy phase a partition is easy to find with known algorithms. When a network lies in the hard phase, it still has an underlying structure, yet finding a meaningful partition, which can be checked in polynomial time, requires an exhaustive computational effort that increases rapidly with the size of the graph. Taken together, (i) the rigorous results that we report here on when graphs have an underlying structure and (ii) recent results concerning the limits of rather general algorithms suggest bounds on the hard phase.
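To make the notion of inter-community "noise" concrete, the following is a minimal sketch of a planted-partition SBM sampler, not the benchmark construction used in the paper. The function names, the equal-size round-robin community assignment, and the parameters `p_in` and `p_out` (intra- and inter-community edge probabilities) are illustrative assumptions; the abstract does not specify a parametrization.

```python
import random
from itertools import combinations


def sample_sbm(n, q, p_in, p_out, seed=0):
    """Sample an undirected planted-partition SBM graph.

    Assumptions (not taken from the paper): nodes are assigned to q
    communities round-robin, and each pair of nodes is joined
    independently with probability p_in (same community) or p_out
    (different communities).
    Returns (labels, edges), where edges is a list of (u, v) with u < v.
    """
    rng = random.Random(seed)
    labels = [i % q for i in range(n)]  # planted partition
    edges = []
    for u, v in combinations(range(n), 2):
        p = p_in if labels[u] == labels[v] else p_out
        if rng.random() < p:
            edges.append((u, v))
    return labels, edges


def inter_community_fraction(labels, edges):
    """Fraction of edges whose endpoints lie in different communities."""
    if not edges:
        return 0.0
    inter = sum(1 for u, v in edges if labels[u] != labels[v])
    return inter / len(edges)


if __name__ == "__main__":
    labels, edges = sample_sbm(n=60, q=3, p_in=0.5, p_out=0.05, seed=1)
    print(len(edges), inter_community_fraction(labels, edges))
```

Raising `p_out` toward `p_in` in this sketch increases the inter-community fraction until the planted partition is no longer distinguishable from a uniform random graph, which is the kind of regime the abstract calls unsolvable.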
Related articles
Bayesian estimation from few samples: community detection and related problems
We propose an efficient meta-algorithm for Bayesian estimation problems that is based on low-degree polynomials, semidefinite programming, and tensor decomposition. The algorithm is inspired by recent lower bound constructions for sum-of-squares and related to the method of moments. Our focus is on sample complexity bounds that are as tight as possible (up to additive lower-order terms) and oft...
The Geometric Block Model
To capture the inherent geometric features of many community detection problems, we propose to use a new random graph model of communities that we call a Geometric Block Model. The geometric block model generalizes the random geometric graphs in the same way that the well-studied stochastic block model generalizes the Erdős-Rényi random graphs. It is also a natural extension of random community...
The Lovász Theta Function for Random Regular Graphs and Community Detection in the Hard Regime
We derive upper and lower bounds on the degree d for which the Lovász θ function, or equivalently sum-of-squares proofs with degree two, can refute the existence of a k-coloring in random regular graphs G_{n,d}. We show that this type of refutation fails well above the k-colorability transition, and in particular everywhere below the Kesten-Stigum threshold. This is consistent with the conjecture ...
A scalable community detection algorithm for large graphs using stochastic block models
Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of “big data,” traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stag...
Stochastic Block Model and Community Detection in the Sparse Graphs: A spectral algorithm with optimal rate of recovery
In this paper, we present and analyze a simple and robust spectral algorithm for the stochastic block model with k blocks, for any k fixed. Our algorithm works with graphs having constant edge density, under an optimal condition on the gap between the density inside a block and the density between the blocks. As a co-product, we settle an open question posed by Abbe et al. concerning censor bl...
Journal title:
- J. Complex Networks
Volume 3
Publication date: 2015